> 3. Rootfinding 


Rootfinding for Nonlinear Equations 


> 3. Rootfinding 


Calculating the roots of an equation 
f(x) =0 (7.1) 


is a common problem in applied mathematics. 


We will 
e explore some simple numerical methods for solving this equation, 
and also will 


o consider some possible difficulties 


> 3. Rootfinding 


The function f(x) of the equation (7.1) 
9 will usually have at least one continuous derivative, and often 
ə we will have some estimate of the root that is being sought. 


By using this information, most numerical methods for (7.1) compute 
a sequence of increasingly accurate estimates of the root. 
These methods are called iteration methods. 


We will study three different methods 
Q the bisection method 
Q Newton's method 
Q secant method 


and give a general theory for one-point iteration methods. 


> 3. Rootfinding > 3.1 The bisection method 


In this chapter we assume that f : R —> R i.e., 
f (x) is a function that is real valued and that z is a real variable. 
Suppose that 


9 f(x) is continuous on an interval [a,b], and 
o 


f(a) f(b) < 0 (7.2) 


Then f(a) changes sign on [a,b], and f(x) = 0 has at least one root on the 
interval. 


Definition 


The simplest numerical procedure for finding a root is to repeatedly halve the 
interval [a, b], keeping the half for which f(a) changes sign. This procedure is 
called the bisection method, and is guaranteed to converge to a root, 

denoted here by a. 


> 3. Rootfinding > 3.1 The bisection method 


Suppose that we are given an interval [a,b] satisfying (7.2) and an error 
tolerance £ > 0. 
The bisection method consists of the following steps: 


B1 Define c — ab 
B2 If b—c<e, then accept c as the root and stop. 


B3 If sign[f (b)] - sign[ f(c)] € 0, then set a = c. 
Otherwise, set b = c. Return to step B1. 


The interval [a,b] is halved with each loop through steps B1 to B3. 
The test B2 will be satisfied eventually, and with it the condition |a — c| < € 
will be satisfied. 


Notice that in the step B3 we test the sign of sign[f (b)] - sign[f(c)] in order 
to avoid the possibility of underflow or overflow in the multiplication of f(b) 


and f(c). 


> 3. Rootfinding > 3.1 The bisection method 


Find the largest root of 
f(a) =2° —cz-1=0 (7.3) 
accurate to within € = 0.001. 


With a graph, it is easy to check that 1 < a < 2 
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> 3. Rootfinding > 3.1 The bisection method 


The results of the algorithm B1 to B3: 


n a b c b—c | flo) 

1 | 1.0000 | 2.0000 | 1.5000 | 0.5000 | 8.8906 
2 | 1.0000 | 1.5000 | 1.2500 | 0.2500 | 1.5647 
3 | 1.0000 | 1.2500 | 1.1250 | 0.1250 | -0.0977 
4 | 1.1250 | 1.2500 | 1.1875 | 0.0625 | 0.6167 
5 | 1.1250 | 1.1875 | 1.1562 | 0.0312 | 0.2333 
6 | 1.1250 | 1.1562 | 1.1406 | 0.0156 | 0.0616 
7 | 1.1250 | 1.1406 | 1.1328 | 0.0078 | -0.0196 
8 | 1.1328 | 1.1406 | 1.1367 | 0.0039 | 0.0206 
9 | 1.1328 | 1.1367 | 1.1348 | 0.0020 | 0.0004 
10 | 1.1328 | 1.1348 | 1.1338 | 0.00098 | -0.0096 


Table: Bisection Method for (7.3) 
The entry n indicates that the associated row corresponds to iteration 


number n of steps B1 to B3. 


> 3. Rootfinding > 3.1.1 Error bounds 


Let an, bn and cn denote the nt” computed values of a,b and c: 


1 
bn+1 — ün+1 = 5 (bn m an), n> 1 


and 
1 
= 9n—1 


(b — a) (7.4) 


b, — An 


where b — a denotes the length of the original interval with which we started. 
Since the root o € |an, Cn] or a € [Cn, bn], we know that 


1 
o: — en] S Cn — an = bn — Cn = 5 (ba — an) (7.5) 


This is the error bound for cn that is used in step B2. 
Combining it with (7.4), we obtain the further bound 


1 
la — cn| < mA a). 


This shows that the iterates Cn — @ as n — oo. 


> 3. Rootfinding > 3.1.1 Error bounds 


To see how many iterations will be necessary, suppose we want to have 
ja — cn| < € 
This will be satisfied if 


For the previous example (7.3), this results in 


> log (zr) = 9.97 
— log2 


i.e., we need n — 10 iterates, exactly the number computed. 


> 3. Rootfinding > 3.1.1 Error bounds 


There are several advantages to the bisection method 
@ It is guaranteed to converge. 


@ The error bound (7.5) is guaranteed to decrease by one-half with each 
iteration 


Many other numerical methods have variable rates of decrease for the error, 
and these may be worse than the bisection method for some equations. 


The principal disadvantage of the bisection method is that 


@ generally converges more slowly than most other methods. 


For functions f(x) that have a continuous derivative, other methods are 
usually faster. These methods may not always converge; when they do 
converge, however, they are almost always much faster than the bisection 
method. 


> 3. Rootfinding > 3.2 Newton's method 


Figure: The schematic for Newton's method 


> 3. Rootfinding > 3.2 Newton's method 


There is usually an estimate of the root a, denoted xo. 

To improve it, consider the tangent to the graph at the point 

(xo, f (xo)). 

If zo is near o; then the tangent line ~ the graph of y = f(x) for 
points about a. 

Then the root of the tangent line should nearly equal o, denoted 71. 


> 3. Rootfinding > 3.2 Newton's method 


The line tangent to the graph of y = f(x) at (xo, f(xo)) is the graph 
of the linear Taylor polynomial: 


pi(x) = f (xo) + f (zo) (a — xo) 
The root of pi(z) is zi: 


f (xo) + f'(zo)(z1 — zo) =0 


f (xo) 


v1, ZT 


° fo) 


> 3. Rootfinding > 3.2 Newton's method 


Since x1 is expected to be an improvement over xp as an estimate of a, we 
repeat the procedure with x; as initial guess: 


€ f) 
f' (21) 
Repeating this process, we obtain a sequence of numbers, iterates, 
£1, £2, £3, ... hopefully approaching the root a. 
The iteration formula 
& = In — =, $0 12... 7.6 
n+1 n f! (xn) , ( ) 


is referred to as the Newton's method, or Newton-Raphson, for solving 


f(x) — 0. 


> 3. Rootfinding > 3.2 Newton's method 


Using Newton's method, solve (7.3) used earlier for the bisection method. 


Here 


and the iteration 


zÊ — ön —1 


fames e n>0 (7.7) 


The true root is a = 1.134724138, and xg = a to nine significant digits. 


Newton's method may converge slowly at first. However, as the iterates come 
closer to the root, the speed of convergence increases. 


> 3. Rootfinding > 3.2 Newton's method 


newton.m 


n Zn Fian) | n= anr | €—£n.1 
0} 1.5 8.89E+1 

1 | 1.30049088 2.54E+1 -2.00E-1 -3.65E-1 
2 | 1.18148042 | 5.388E-1 | -119E-1 | -1.66E-1 
3 | 113945559 | 4.92E—2 | -420E- | -4.68E-2 
4 | 1.13477763 5.50E—4 -4.68E-3 -4.73E-3 
5 | 1.13472415 | 7.11E—-8 | -5.35E-5 | -5.35E-5 
6 | 1.13472414 | L55E-15 | -6.91E-9 | -6.91E-9 

1.134724138 


Table: Newton's Method for xê —x—1=0 


Compare these results with the results for the bisection method. 


> 3. Rootfinding > 3.2 Newton's method 


One way to compute £ on early computers (that had hardware arithmetic for 
addition, subtraction and multiplication) was by multiplying a and +, with i 
approximated by Newton's method. 


1 
f(z) =b- = 0 
where we assume b > 0. The root is a = i the derivative is 
roes 
and Newton's method is given by 
b— + 
En+1 = In — l5 


Ladi = fal? — bzy) m0 (7.8) 


> 3. Rootfinding > 3.2 Newton's method 


This involves only multiplication and subtraction. 
The initial guess should be chosen xo > 0. 
For the error it can be shown 


Rel(2n41) = [Rel(z,)],  n >0 (7.9) 


where 


Q — Tn 


Rel(x4) = 
el(tn) = = 
the relative error when considering £n as an approximation to œ = 1/b. From 
(7.9) we must have 

|Rel(xo)| < 1 


Otherwise, the error in x, will not decrease to zero as n increases. 
This contradiction means 
1 
-1 < 4—— «1 


equivalently 


2 
0<a <5 (7.10) 


> 3. Rootfinding > 3.2 Newton's method 


The iteration (7.8), %41 = %,(2 — bzn), n > 0, converges to a = t if and 
only if the initial guess xo satisfies 
0 < zo < 2 


P tolo) | 


Figure: The iterative solution of b — 1 — 0 


> 3. Rootfinding > 3.2 Newton's method 


If the condition on the initial guess is violated, the calculated value of 
xı and all further iterates would be negative. 


The result (7.9) shows that the convergence is very rapid, once we 
have a somewhat accurate initial guess. 


For example, suppose | Rel(zo)| = 0.1, which corresponds to a 10% 
error in zo. Then from (7.9) 
Rel(zi) = 10-2, Rel(x2) = 1074 


7.11 
Rel(z3) = 1078, Rel(x4) = 10716 Co 


Thus, x3 or x4 should be sufficiently accurate for most purposes. 


> 3. Rootfinding > 3.2.1 Error Analysis 


Assume that f € C? in some interval about the root a, and 


f'(a) #0, (7.12) 


i.e., the graph y = f(x) is not tangent to the z-axis when the graph 
intersects it at x = a. The case in which f'(a) = 0 is treated in Section 3.5. 
Note that combining (7.12) with the continuity of f'(x) implies that 

f'(x) z 0 for all x near a. 

By Taylor's theorem 


f(a) = Fen) + (a an) J (03) + Zla — n)? f" (os) 


with cn an unknown point between a and Zp. 
Note that f(a) = 0 by assumption, and then divide by f’(x,,) to obtain 


NECS » f" (es) 
F'@n) 


+a — Tn + (a — zn) 


2f’ (En) 


> 3. Rootfinding > 3.2.1 Error Analysis 


Solving for o, — 41, we have 


ES (713) 


2f'(zn) 


This formula says that the error in £n+1 is nearly proportional to the 
square of the error in £n. 

When the initial error is sufficiently small, this shows that the error in 
the succeeding iterates will decrease very rapidly, just as in (7.11). 


Q— Zg41 = (a — En) | 


Formula (7.13) can also be used to give a formal mathematical proof 
of the convergence of Newton's method. 


> 3. Rootfinding > 3.2.1 Error Analysis 


6 = a 
For the earlier iteration (7.7), i.e., z441 = Zn — Me > 0, we have 
f" (a) = 30x4. If we are near the root a, then 


zc EMO ae 
TIN = -2.42 


~ 2f(o)  2(605— 1) 
Thus for the error in (7.7), 


a — £441 S —2.42(a — z4)? (7.14) 


This explains the rapid convergence of the final iterates in table. 
For example, consider the case of n = 3, with a — 23 = —.73E — 3. Then 
(7.14) predicts 


a — 14 = 2.42(4.73E — 3? = —5.42E — 5 


which compares well to the actual error of a — z4 = 5.35E — 5. 


> 3. Rootfinding > 3.2.1 Error Analysis 


If we assume that the iterate x, is near the root a, the multiplier on the RHS 


of (7.13), i.e., a — z441 = (a — Lp)? ELI can be written as 


og ET 
Deja = ue 
Thus, 
a — £441 ^? M(a — £n)’, n>0 
Multiply both sides by M to get 
M(a — 2541) & [M (a = z4)]? 


Assuming that all of the iterates are near a, then inductively we can show that 


M(a—-z,)m|[M(o—-zo]?, n>0 


> 3. Rootfinding > 3.2.1 Error Analysis 


Since we want a — £n to converge to zero, this says that we must have 


[|M(a — zo)| < 1 


1 2f'(o) 


iu] 7 | Fa) en) 


la — zo| < 


If the quantity |M] is very large, then zo will have to be chosen very close to 
a to obtain convergence. In such situation, the bisection method is probably 


an easier method to use. 
The choice of x can be very important in determining whether 
Newton's method will converge. 


Unfortunately, there is no single strategy that is always effective in choosing Zo. 


9 |n most instances, a choice of zo arises from physical situation that led 
to the rootfinding problem. 


9 In other instances, graphing y = f(x) will probably be needed, possibly 
combined with the bisection method for a few iterates. 


> 3. Rootfinding > 3.2.2 Error estimation 


We are computing sequence of iterates £n, and we would like to estimate 
their accuracy to know when to stop the iteration. 
To estimate o — £n, note that, since f(a) = 0, we have 


f(@n) = f(n) — F(a) = f’(En) (an — o) 


for some €,, between x,, and a, by the mean-value theorem. 
Solving for the error, we obtain 


= —f(z«) re -f(za) 
FG) — f(x) 


provided that £n is so close to a that f’(x,) = f'(£,). From the 


Newton-Raphson method (7.6), i.e., p41 = Zn — Z4. this becomes 


Q — Tn 


Qt — En £2 Xn41 — Tn (7.17) 


This is the standard error estimation formula for Newton's method, and it is 
usually fairly accurate. 


However, this formula is not valid if f'(a) = 0, a case that is discussed in Section 3.5. 


> 3. Rootfinding > 3.2.2 Error estimation 


Consider the error in the entry x3 of the previous table. 


Q—433 = -—4.73E -3 
z4 — z3 = —4.68E —3 


This illustrates the accuracy of (7.17) for that case. 


> 3. Rootfinding > Error Analysis - linear convergence 


Use Newton's Method to find a root of f(x) = z?. 


2 
Enpi = En- Pay = tn — Ze, = 
So the method converges to the root o = 0, but the convergence is only linear 
en 
En+1 = p 


Use Newton's Method to find a root of f(x) = x^". 


m 
Ty m-—1 


MEm—1 m 


The method converges to the root o = 0, again with linear convergence 


Xn] = Xn — 


m-—1 


En+1 = En. 


m 


> 3. Rootfinding > Error Analysis - linear convergence 


Assume f € C"'*![a, b] and has a multiplicity m root a. Then Newton's 
Method is locally convergent to a, and the absolute error e,, satisfies 


ie eee De (7.18) 


> 3. Rootfinding > Error Analysis - linear convergence 


Find the multiplicity of the root a = 0 of f(x) = sinz + x? cosa — 2? — 2, 


and estimate the number of steps in NM for convergence to 6 correct decimal 
places (use zo = 1). 


(£) =sinz+2?cosx—a2?-2@ => f(0) =0 
f'(x) = cosg + 2g cosg -— zr? sing — 24-1 => f'(0)=0 
f'(x) = —sinz--2cosz -— 4rsin g — z?cosr—2 = f"(0)—0 
f” (x) = —cosz — 6sinz — 6x cos x + z? sin x => f"(0)--1 


2 


Hence a = 0 is a triple root, m = 3; so en41 & fen. 
Since eg = 1, we need to solve 


2N” logn .5— 6 
(3) <05x10-°, n> LEL? y 35.78. 


3 


log;o 2/3 


> 3. Rootfinding > Modified Newton’s Method 


If the multiplicity of a root is known in advance, convergence of Newton's 
Method can be improved. 


Assume f € C™+t[a,b] which contains a root o of multiplicity m > 1. Then 
Modified Newton's Method 


m Jn) 


Bp) =p m fs.) (7.19) 


converges locally and quadratically to a. 


Proof. MNM: mf (an) = (En — £n41) f (za). 
Taylor's formula: 
Q = (s, Wu MCN 
= REM f(r, DE u 14) + + f'(c e Za)? 


t (a — zn) f 
= Stet f'(a, ji bi no. Ve sg 


> 3. Rootfinding > Nonconvergent behaviour of Newton's Method 


Apply Newton's Method to f(x) = —a* + 3x? +2 with starting guess zo = 1. 


The Newton formula is 
4 2 
e —a° +307, +2 
ntl = Tn — —423 +62, ' 


which gives 


zı = —1, Te 15. 


> 3. Rootfinding > Nonconvergent behaviour of Newton's Method 


Failure of Newton"s Method for =f + 3x° + 2-0 


> 3. Rootfinding > 3.3 The secant method 


The Newton method is based on approximating the graph of y = f(a) with a 
tangent line and on then using a root of this straight line as an approximation 
to the root a of f(x). 

From this perspective, 

other straight-line approximation to y — f(x) would also lead to methods of 
approximating a root of f(x). One such straight-line approximation leads to 
the secant method. 


Assume that two initial guesses to œ are known and denote them by 
Zo and xı. 
They may occur 


9 on opposite sides of o, or 


9 on the same side of a. 


> 3. Rootfinding > 3.3 The secant method 


Figure: A schematic of the secant method: x1 < o < xo 


> 3. Rootfinding > 3.3 The secant method 


Figure: A schematic of the secant method: a < 21 < xo 


> 3. Rootfinding > 3.3 The secant method 


To derive a formula for 72, we proceed in a manner similar to that used to 
derive Newton's method: 

Find the equation of the line and then find its root z2. 
The equation of the line is given by 


y = p(x) = f(x1) + (x — z1); 


Solving p(z3) = 0, we obtain 


f(w1) = f(xo) 


Tı — To 


zı — To 
f (x1) — f (z0) 
Having found a2, we can drop xg and use x1, 22 as a new set of approximate 
values for a. This leads to an improved values z3; and this can be continued 
indefinitely. Doing so, we obtain the general formula for the secant method 


z2 = 41 — f(t1): 


Lyn — Xn—1i 


Ff (en) — F(@n—1)’ 


It is called a two-point method, since two approximate values are needed to 


n> 1. (7.20) 


Ln+1 = Xn 


obtain an improved value. The bisection method is also a two-point method, 
but the secant method will almost always converge faster than bisection. 


> 3. Rootfinding > 3.3 The secant method 


> 3. Rootfinding > 3.3 The secant method 


secant.m 


We solve the equation f(x) = xê — z — 1 = 0. 


n Tn Tn — Tn—1 Q — Tn—1 
0 | 2.0 61.0 

1 | 1.0 -1.0 -1.0 

2 | 1.01612903 | -9.15E-1 | 1.61E-2 1.35E-1 
3 | 1.19057777 | 6.57E-1 | 1.74E-1 1.19E-1 
4 | 1.11765583 | -1.68E-1 | -7.29E-2 -5.59E-2 
5 | .113253155 | -2.24E-2 | -2.24E2 1.71E-2 
6 | 1.13481681 | 9.54E-4 | 2.29E-3 2.19E-3 
7 | 1.13472365 | -5.07E-6 | -9.32E-5 -9.27E-5 
8 | 1.13472414 | -1.13E-9 | 4.92E-7 4.92E-7 


The iterate xg equals o rounded to nine significant digits. 


As with the Newton method (7.7) for this equation, the initial iterates do not 
converge rapidly. But as the iterates become closer to a, the speed of 


convergence increases. 


= 


3. Rootfinding > 3.3.1 Error Analysis 


By using techniques from calculus and some algebraic manipulation, it 
is possible to show that the iterates x, of (7.20) satisfy 


— f" (En) 
2f' (64) 


The unknown number Cn is between £n and x41, and the unknown 
number £, is between the largest and the smallest of the numbers 

Q, £n and z4 4. The error formula closely resembles the Newton error 
formula (7.13). This should be expected, since the secant method can 
be considered as an approximation of Newton's method, based on using 


f' (an) = f (xn) — F(tn—-1) 


Ln — n-1 


Q — $4441 = (A — Zn) (Qa — x41) (7.21) 


Check that the use of this in the Newton formula (7.6) will yield (7.20). 


> 3. Rootfinding > 3.3.1 Error Analysis 


The formula (7.21) can be used to obtain the further error result that 
if zy and zı are chosen sufficiently close to a, then we have 
convergence and 


gs e-ma |I) PO 
noo |æ = E  |2f(o)|  — 
where r — MEM — 1.62. Thus, 
la — £544| S cla — £n] £? (7.22) 


as £n approaches a. Compare this with the Newton estimate (7.15), in 
which the exponent is 2 rather then 1.62. Thus, Newton's method 
converges more rapidly than the secant method. Also, the constant c 
in (7.22) plays the same role as M in (7.15), and they are related by 


c= |M. 


The restriction (7.16) on the initial guess for Newton's method can be 
replaced by a similar one for the secant iterates, but we omit it. 


> 3. Rootfinding > 3.3.1 Error Analysis 


Finally, the result (7.22) can be used to justify the error estimate 


Q — In-1 ~~ In — Ln-1 


for iterates x,, that are sufficiently close to the root. 


For the iterate x5 in the previous Table 


a = it —219F —3 


; (7.23) 
Le a = 2.29E — 3 


> 3. Rootfinding > 3.3.2 Comparison of Newton and Secant methods 


9 From the foregoing discussion, Newton's method converges more rapidly 
than the secant method. Thus, 
Newton’s method should require fewer iterations to attain a given 
error tolerance. 


9 However, Newton's method requires two function evaluations per 
iteration, that of f(x,) and f'(r,). And the secant method requires 
only one evaluation, f(a»), if it is programed carefully to retain the 
value of f(x; 1) from the preceding iteration. Thus, 
the secant method will require less time per iteration than the 
Newton method. 


The decision as to which method should be used will depend on the factors 
just discussed, including the difficulty or expense of evaluating f'(z,); 

and it will depend on intangible human factors, such as convenience of use. 
Newton's method is very simple to program and to understand; but for many 
problems with a complicated f'(x), the secant method will probably be faster 
in actual running time on a computer. 


> 3. Rootfinding > 3.3.2 Comparison of Newton and Secant methods 


The derivation of both the Newton and secant methods illustrate a general 
principle of numerical analysis. 


When trying to solve a problem for which there is no direct or simple method 
of solution, approximate it by another problem that you can solve more easily. 


In both cases, we have replaced the solution of 


f(x) =0 


with the solution of a much simpler rootfinding problem for a linear equation. 


GENERAL OBSERVATION 

When dealing with problems involving differentiable functions f(x), 
move to a nearby problem by approximating each such f(x) with a 
linear problem. 


The linearization of mathematical problems is common throughout applied 
mathematics and numerical analysis. 


> 3. Rootfinding > 3.3 The MATLAB function f zero 


MATLAB contains the rootfinding routine f zero that uses ideas 
involved in the bisection method and the secant method. As with 
many MATLAB programs, there are several possible calling sequences. 
9 The command 
root = fzero(f name, [a, 5]) 
produces a root within [a, b], where it is assumed that 
f(a)f(b) < 0. 
e The command 
root = fzero(f name, x0) 
tries to find a root of the function near x0. 


The default error tolerance is the maximum precision of the machine, 
although this can be changed by the user. 

This is an excellent rootfinding routine, combining guaranteed 
convergence with high efficiency. 


> 3. Rootfinding > Method of False Position 


There are three generalization of the Secant method that are also 
important. The Method of False Position, or Regula Falsi, is similar 
to the Bisection Method, but where the midpoint is replaced by a 
Secant Method-like approximation. Given an interval [a, b] that 
brackets a root (assume that f(a) f(b) < 0), define the next point 


o FO = af(0) 
f(a) - 10) 


as in the Secant Method, but unlike the Secant Method, the new point 
is guaranteed to lie in [a,b], since the points (a, f(a)) and (b, f(b)) lie 
on separate sides of the x-axis. The new interval, either [a,c] or [c, 0], 
is chosen according to whether f(a)f(c) « 0 or f(c)f(b) « 0, 
respectively, and still brackets a root. 


> 3. Rootfinding > Method of False Position 


Given interval [a,b] such that f(a) f(b) < 0 


for i = 1,2,3,... 
bf (a) = af) 
f(a) — f(b) 


if f(c) = 0, stop, end 
if f(a) f(c) <0 

b=c 
else 


a= 


end 


The Method of False Position at first appears to be an improvement on both 
the Bisection Method and the Secant Method, taking the best properties of 
each. However, while the Bisection method guarantees cutting the 
uncertainty by 1/2 on each step, False Position makes no such promise, and 
for some examples can converge very slowly. 


> 3. Rootfinding > Method of False Position 


Apply the Method of False Position on initial interval [-1,1] to find the 
root r — 1 of f(x) = 2? — 2x? + 2a. 


Given zo = —1, gı = 1 as the initial bracketing interval, we compute 
the new point 


zı f (xo) = zof (21) _ 1(—9/2) = (—1)1/2 = 4 
f (xo) — f(z1) -9/2 — 1/2 à 


T2 = 


Since f(—1)/(4/5) < 0, the new bracketing interval is 

[£o, 2] = [—1, 0.8]. This completes the first step. Note that the 
uncertainty in the solution has decreased by far less than a factor of 
1/2. As seen in the Figure, further steps continue to make slow 
progress toward the root at x = 0. 

Both the Secant Method and Method of False Position converge slowly 
to the root r = 0. 


> 3. Rootfinding > Method of False Position 


> 3. Rootfinding > Method of False Position 


> 3. Rootfinding > 3.4 Fixed point iteration 


e The Newton method (7.6) 


X 
mecs ME n=0,1,2,... 


e and the secant method (7.20) 


In — Xn-—1 
Xa = Tn — f (£n): n>1 
» " i F(a) — fna) 
are examples of one-point and two-point iteration methods, 
respectively. 


In this section we give a more general introduction to iteration 
methods, presenting a general theory for one-point iteration formulae. 


> 3. Rootfinding > 3.4 Fixed point iteration 


Solve the equation 
x = g(x) 
for a root a = g(a) by the iteration 


Zo, 
Vaid = Oley), n=0,1,2,... 


Example: Newton's Method 


Definition 
The solution a is called a fixed point of g. 


The solution of f(a) = 0 can always be rewritten as a fixed point of g, e.g., 


x+ fle)=x£ = g(t)=2r4+ f(z). 


> 3. Rootfinding > 3.4 Fixed point iteration 


As motivational example, consider solving the equation 


az? —-5=0 (7.24) 


for the root a = v5 = 2.2361. 


We give four methods to solve this equation 


ll. tae = 5 + Tn — 22 r—z-c(a?—a)cz0 
I2. dr s gat 
In x 
l 2 2 
I3. nti =1+ ga — pra z-—cr-cc(zr|—a).4cz0 
1 5 1 


All four iterations have the property that if the sequence (zx, : n > 0} has a 
limit a, then a is a root of (7.24). For each equation, check this as follows: 
Replace £n and 2,41 by o, and then show that this implies a = +5. 


> 3. Rootfinding > 3.4 Fixed point iteration 


£n:ll z4:12 | zn 313 Tn :l14 | 
2.5 2.5 2.5 25 | 
| 
| 
| 


1.25 2.0 2.25 2.25 
4.6875 2.5 | 2.2375 | 2.2361 
-12.2852 | 2.0 | 2.2362 | 2.2361 


wi N| =| 9| 53 


Table: The iterations l1 to 14 
To explain these numerical results, we present a general theory for one-point 
iteration formulae. 
The iterations 11 to 14 all have the form 


Tn] = g(25) 


for appropriate continuous functions g(a). For example, with 11, 
g(x) — 5 4- x — x?. If the iterates z,, converge to a point a, then 


a = g(a) 


Thus a is a solution of the equation x = g(x), and o is called a fixed point of 
the function g. 


> 3. Rootfinding > 3.4 Fixed point iteration 


In this section, a general theory is given to explain when the iteration 
Lnt1 = f(x4) will converge to a fixed point of g. 
We begin with a lemma on existence of solutions of z — g(x). 


Let g € C[a, 6]. Assume that g([a, b]) C Ja, b], i.e., 
vz € [a,b], g(x) € [a,b]. 
Then z = g(x) has at least one solution a in the interval [a,b]. 


Proof. Define the function f(x) = a — g(x). It is continuous for a < x < b. 


Moreover, 
f(a) 2 a— g(a) <0 
f(b) = b— g(b) > 0 


Intermediate value theorem — 3x € [a,b] such that f(x) = 0, i.e. x = g(x). 


> 3. Rootfinding > 3.4 Fixed point iteration 


3.5 T T T T T 


3.5 


The solutions o are the x-coordinates of the intersection points of the 
graphs of y = x and y = g(x). 


> 3. Rootfinding > 3.4 Fixed point iteration 


Definition 


Given g : [a,b] — R, is called Lipschitz continuous with constant A > 0 
(denoted g € Lip, [a,b]) if 3A > 0 such that 
lg(v) - (y) SA =y] — Va,y € [a,b]. 


Definition 


g : [a,b] — R is called contraction map if g € Lip, [a,b] with A < 1. 


> 3. Rootfinding > 3.4 Fixed point iteration 


Let g € Lipy|a, 6] with A < 1 and g([a,5]) C [a,b]. 
Then x = g(a) has exactly one solution œ. Moreover, for 
Int+1 = JEn), Ln — a for any zo € [a,b] and 


Proof. 
Existence: follows from previous Lemma. 
Uniqueness: assume Ja, B solutions: a = g(a), 8 = g(D). 


la — 8| = |9(@) — 9(8)] < Ala — BI 
(1 — A) |a - £| <0. 
—— 


20 


> 3. Rootfinding > 3.4 Fixed point iteration 


If £n € [a,b] then g(£n) = x&41 € [a, b]) > {£n}n>0 C [a,b]. 


a — Lp| = |g(a) — g(x4)| € Ala — zn-1| € ... € A” |a — zo| (7.25) 


[zo — a| = |£o — 21+ z1— | € [xo — z1i| + |z1 — | 


< |zo — z1| + Aļzo — a| 
LQ — LY 
1—A 


= |zo — a| < (7.26) 


|En — a| < A"|zo — a| < 


AP 
mp dias 


> 3. Rootfinding > 3.4 Fixed point iteration 


a — Tn| € |a — £ni| + |En+1 — Ln| < Ala — za] + |En — £n 


1 
|a = £n| S [len = wil 


la — n4i| € Ala — zal 


= | al E 


A 
qe mu In 


> 3. Rootfinding > 3.4 Fixed point iteration 


Assume g'(x) exists on [a,b]. By the mean value theorem: 
g(x) -s(y) 2g (€)—-v), €€ [a,b], Yz, y € [a,b] 
Define 
A= ‘(x)|. 
EA 


Then g € Lipy{a, b]: 


le(z) — g(y)| € Ig (Ola — y| € Ala — yl. 


> 3. Rootfinding > 3.4 Fixed point iteration 


Theorem 2.6 

Assume g € C1 (a, b], g([a, 6]) C [a,b] and max,zejq, |g'(x)| < 1. Then 
Q x = g(x) has a unique solution o in [a,b], 
gu 2 Vzo € [a,b], 


a — £541 = g(a) — g(z«) 
= g'(En)(a — £n), fn € [o £n]. 


> 3. Rootfinding > 3.4 Fixed point iteration 


Assume a solves x = g(x), g € C [Ia], for some Ta 2 a, |g’(a)| < 1. 
Then Theorem 2.6 holds for xo close enough to a. 


Proof. 
Since |g'(o))| < 1 by continuity 


\g'(x)| <1 for x € Ia = [a — £€,a + el. 
Take zo € Ia close to x1 € Ia 


|z1 — a| = [g(zo) — 9(o)| = |g'(£)(xo — a)| 
< |g'(£)||zo — a| < |xo - o] < € 


=> x1 € I, and, by induction, £n € Ig. 
= Theorem 2.6 holds with [a,b] = Iq. L] 


> 3. Rootfinding > 3.4 Fixed point iteration 


Importance of |g'(o)| < 1: 


If |g'(a)| > 1 : and £n is close to o then 


[En+1 — o] = |g (&x)||n — al 
> [zn — e] > ]zs — o] 
=> divergence 


When g'(a) — 1, no conclusion can be drawn; and even if convergence 
were to occur, the method would be far too slow for the iteration 
method to be practical. 


> 3. Rootfinding > 3.4 Fixed point iteration 


Recall a = V5. 
@ g(x) 25-4 x — 22; g'(x) = 1— 2x, g(a) =1— 2v5. Thus 
the iteration will not converge to v5. 


Note that this is Newton's method for computing v5. 


> 3. Rootfinding > 3.4 Fixed point iteration 


Recall a = V5. 
@ g(x) =54+2-27;,9'(x) = 1— 2x, g(a) = 1 — 2v5. Thus 
the iteration will not converge to v5. 


@ o(2)=5/2,d(0)=-3; g(a) = um m 


We cannot conclude that the iteration converges or diverges. 
From the Table, it is clear that the iterates will not converge to a. 


Note that this is Newton's method for computing v5. 


> 3. Rootfinding > 3.4 Fixed point iteration 


Recall a = v5. 
@ g(x) =54+2-27;,9'(x) = 1— 2x, g(a) = 1 — 2v5. Thus 
the iteration will not converge to v5. 


@ o(2)=5/2,d(2)=-3; g(a) = um m 


We cannot conclude that the iteration converges or diverges. 
From the Table, it is clear that the iterates will not converge to a. 


© g(z) 214 z — tr’, g'(x) 2 1— 2z, g (o) = 1— 2V5 = 0.106, 
i.e., the iteration will converge. Also, 


|a — In41| ~ 0.106|a — zn] 


when zn is close to a. The errors will decrease by approximately a 
factor of 0.1 with each iteration. 


Note that this is Newton's method for computing v5. 


> 3. Rootfinding > 3.4 Fixed point iteration 


Recall a = v5. 


@ g(x) =54+2-27;,9'(x) = 1— 2x, g(a) = 1 — 2v5. Thus 
the iteration will not converge to v5. 


@ o(2)=5/2,4(0)=-2; g(a) = um m 


We cannot conclude that the iteration converges or diverges. 
From the Table, it is clear that the iterates will not converge to a. 

© g(z) 214 z — tr’, g'(x) 2 1 — 2z, g (o) = 1— $V5 = 0.106, 
i.e., the iteration will converge. Also, 


|a — In41| ~ 0.106|a — zn] 


when zx, is close to a. The errors will decrease by approximately a 
factor of 0.1 with each iteration. 


O g(z) 2 3 (r3); g(oa) =0 convergence 
Note that this is Newton's method for computing v5. 


> 3. Rootfinding > 3.4 Fixed point iteration 


z = g(z), g(x) = x + c(z? — 3) 
What value of c will give convergent iteration? 
g(a) 21-4 2e 
a= v3 
Need |g'(a)| < 1 
Su wol 


Optimal choice: 14-2ce/320 = c= nun 


> 3. Rootfinding > 3.4 Fixed point iteration 


The possible behaviour of the fixed point iterates x, for various sizes 
of g' (a). 

To see the convergence, consider the case of zı = g(xo), the height of 
the graph of y = g(x) at zo. 

We bring the number zı back to the x-axis by using the line y = x 
and the height y = 71. 

We continue this with each iterate, obtaining a stairstep behaviour 
when g/(a) > 0. 


When g'(a) < 0, the iterates oscillate around the fixed point a, as can 
be seen. 


> 3. Rootfinding > 3.4 Fixed point iteration 


3.5 


xi x2 alpha 


Figure: 0 < g(a) <1 


> 3. Rootfinding > 3.4 Fixed point iteration 


3.5 


Figure: —1 < g'(a) <0 


> 3. Rootfinding > 3.4 Fixed point iteration 


Figure: 1 < g'(a) 


> 3. Rootfinding > 3.4 Fixed point iteration 


2.5 


0.5 


Figure: g'(a) < —1 


> 3. Rootfinding > 3.4 Fixed point iteration 
The results from the iteration for 
1 : 
g(x) =14+a- pt. g (a) = 0.106. 


along with the ratios Q — Xn 
TQ —————— 


(7.27) 


Q— Zi 


Empirically, the values of r„ converge to g'(a) = 0.105573, which agrees with 


lim b ue g (o) 

N— Co a 
n Ln Q — Tn Tn 
0 2.5 -2.64E-1 
1 2.25 -1.39E-2 | 0.0528 
2 2.2375 -1.43E-3 | 0.1028 
3 | 2.23621875 | -1.51E-4 | 0.1053 
4 | 2.23608389 | -1.59E-5 | 0.1055 
5 | 2.23606966 | -1.68E-6 | 0.1056 
6 | 2.23606815 | -1.77E-7 | 0.1056 
7 | 2.23606800 | -1.87E-8 | 0.1056 

Table: The iteration 4,4 = 1+ £n — tz? 


> 3. Rootfinding > 3.4 Fixed point iteration 


We need a more precise way to deal with the concept of the speed of 
convergence of an iteration method. 


Definition 
We say that a sequence {£n : n > 0} converges to o with an order of 
convergence p > 1 if 


|a — ze < ceļa — £n”, n20 
for some constant c > 0. 


The cases p = 1, p = 2, p = 3 are referred to as linear convergence, 
quadratic convergence and cubic convergence, respectively. 


ə Newton's method usually converges quadratically; and 


@ the secant method has a order of convergence p = Lys 


9 For linear convergence we make the additional requirement that 
c < 1; as otherwise, the error œ — £n need not converge to zero. 


> 3. Rootfinding > 3.4 Fixed point iteration 


e If g(a) < 1, then formula 
|o — tn41| < [9 (&)|lo — £n] 


shows that the iterates x,, are linearly convergent. 
ə If in addition g'(o) z 0, then formula 


[o — 2i] Ig Co)]]oc — znl 


proves that the convergence is exactly linear, with no higher order 
of convergence being possible. In this case, we call the value of 
g' (a) the linear rate of convergence. 


> 3. Rootfinding > 3.4 Fixed point iteration 


Theorem 2.8 


Assume g € C?(I,,) for some I, containing a, and 


g(a) = g" (a) =... =g V (a) 20, p>2. 
Then for xo close enough to a, £n — « and 
Q — Zn41 p-19 (a) 


EE (a — x4)? Se p! 


i.e. convergence is of order p. 


Proof: = Xn41i = glXn 


> 3. Rootfinding > 3.4 Fixed point iteration 


tei = oy = nes à n 20 

_ _ 1G), 

= g(Xn), g(zn) = x — Fæ 
g(x) = fae g(a) =0 
g(x) = 14 T gad "(q) = £4) 


(y P) 


Theorem 2.8 with p — 2: 


> 3. Rootfinding > 3.4 Fixed point iteration 


Ta] = Xn — B E g(x«). 


/ 
Need |g/(a)| < 1 for convergence: a = Fe) 


Linear convergence wit rate 1 — Po) (Thm 2.6.) 


If a = f'(xo) and zo is close enough to a, then 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


Recall 
Theorem 2.6 


Assuming linear convergence: g(a) £0. 
Derive an estimate for the error and use it to accelerate convergence. 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


Q — Tn = (Q — Z4) + (Ln-1 — Ln) (7.28) 
a — vy = g(a) — g(#n-1) 
= g'(&-1)(o — 2-1) 


Q — Tn-1 = EIE — Tn) (7.29) 
From (7.28)-(7.29) 
Q — Tn = = — Tn) + (Zn-1 — Tn) 


g'(£n-1) 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


Need an estimate for g'(a). 
Define 


a — In41 = g(a) — g(za) = g (En) (a = Ba) En EQ, n, mc. 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


(o — zu 1) — (a — £n) 
(a — $42) = (a — $51) 
_ (a — 1) — g'(En-1)(@ — 9-1) 
(o — 24—1)/9'(£n-2) — (a — z«—1) 


“Feo 


An = 


An > g(o) as£&,— a: àn & g'(o) 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


Aitken Error Formula 


From (7.30) 


A 
a S En + — À— (£a — t1) (7.31) 


Define 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


Repeat the example for 73. | 


The Table contains the differences £n — £n—1, the ratios Ap, and the 

estimated error from a — £n 7: TES (£n — Ln—1), given in the column 
Estimate. Compare the column Estimate with the error column in the 
previous Table. 


Ln Ln — X Àn Estimate 
2.5 
2.25 -2.50E-1 

2.2375 -1.25E-2 | 0.0500 | -6.58E-4 


2.23621875 | -1.28E-3 | 0.1025 | -1.46E-4 
2.23608389 | -1.35E-4 | 0.1053 | -1.59E-5 
2.23606966 | -1.42E-5 | 0.1055 | -1.68E-6 
2.23606815 | -1.50E-6 | 0.1056 | -1.77E-7 
2.23606800 | -1.59E-7 | 0.1056 | -1.87E-8 


Ny} DD} O71] eA GO] DO] IOS 


Table: The iteration 2,4, = 1 + Zn — tz? 


5 and Aitken Error Estimation 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


Given g, xo, €, root, assume |g'(a)| < 1 and £n — a linearly. 


Q x = g(z0). x D = g(21) 
Q 22— r9 + 75 45 (z2 = Zi) where Ay = 


22—21 
T1ı— to 


Q if |ĉ2 — x2| P € then root = 2»; exit 


Q set ro = 25, go to (1) 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


There are a number of reasons to perform theoretical error analyses of 
numerical method. We want to better understand the method, 


ə when it will perform well, 


e when it will perform poorly, and perhaps, 
9 when it may not work at all. 


With a mathematical proof, we convinced ourselves of the correctness 
of a numerical method under precisely stated hypotheses on the 
problem being solved. Finally, we often can improve on the 
performance of a numerical method. 

The use of the theorem to obtain the Aitken extrapolation formula is 
an illustration of the following: 


By understanding the behaviour of the error in a numerical method, it 
is often possible to improve on that method and to obtain another 
more rapidly convergent method. 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


f(a) =0 i fii) 
Tk+1 = Tk — alo k —0,1,... 


Q a, = f'(xx) = Newton's Method 

@ a, = ÎE) = fn) 
Tk — Xk—1 

@ ay = a —constant (e.g. aj = f'(xo)) = Parallel Chords Method 


hi) — 
© a, — {cet P4) — Flee) ys 9 = Finite Diff. Newton Method 


=> Secant Method 


hk 
If |hi| < c| f(x£k)|, then the convergence is quadratic. Need 


hy > hx võ 


> 3. Rootfinding > 3.4.1 Aitken Error Estimation and Extrapolation 


@ a, — =A Fler) 5. Steffensen Method. This is Finite 
Difference Method with hy = f (£k) = quadratic convergence. 
= OC where k’ is the largest index < k such that 
k—2Xg 


f(zx)f (xij) < 0 => Regula False 
Need xo, xı : f(zo)f(z1) < 0 


X1 — To 
m= a Fe Fa) Fao) 
T2 — XO 


> 3. Rootfinding > 3.4.2 High-Order Iteration Methods 


The convergence formula 
a — In41  g'(o)(a — an) 


gives less information in the case g'(o) — 0, although the convergence is 
clearly quite good. To improve on the results in the Theorem, consider the 
Taylor expansion of g(z,,) about a, assuming that g(x) is twice continuously 
differentiable: 


1 
grs) = g(a) + (£n — o)g' (a) + 5 (an — a)" g" (es) (7.33) 
with c, between x, and a. Using z441 = g(an), a = g(a), and g'(a) = 0, 
we have 1 
Zn+1 = Q + 5 n — ay?g" (cn) 
1 
Q — Zn] = -5(a = Tn)” 9" (cn) (7.34) 
. Q — In+1 _ E H 
i Gea a ae 


If g(a) Æ 0, then this formula shows that the iteration 7,41 = g(£n) is of 
order 2 or is quadratically convergent. 


> 3. Rootfinding > 3.4.2 High-Order Iteration Methods 


If also g" (o) = 0, and perhaps also some high-order derivatives are 
zero at a, then expand the Taylor series through higher-order terms in 
(7.33), until the final error term contains a derivative of g that is not 
nonzero at o. Thi leads to methods with an order of convergence 
greater than 2. 


As an example, consider Newton's method as a fixed-point iteration: 


= g(x x ae) 
Int = g( n); g( ) f(x) (7.36) 


Then, (a) f'a) 
1 E f(z)f"(x 
O= TOP 
and if f'(a) Æ 0, then 
g(a) = 0. 
Similarly, it can be shown that g’(a) 4 0 if moreover, f"(a) Z 0. If 


we use (7.35), these results shows that Newton's method is of order 2, 
,provided that f'(a) #0 and f(a) #0. 


e) 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


We will examine two classes of problems for which the methods of 
Sections 3.1 to 3.4 do not perform well. Often there is little that a 
numerical analyst can do to improve these problems, but one should be 
aware of their existence and of the reason for their ill-behaviour. 

We begin with functions that have a multiple root. The root o of f(x) 
is said to be of multiplicity m if 


f(v) =(e@-a)"W(x), h(a) #0 (7.37) 


for some continuous function h(x) with h(a) 4 0, m a positive 
integer. If we assume that f(x) is sufficiently differentiable, an 
equivalent definition is that 


fla) = fila) =--- = f™Vay=0, | f("(a)z0. (7.38) 


A root of multiplicity m = 1 is called a simple root. 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


Example. 
(a) f(x) = (a — 1)?(x + 2) has two roots. The root œ = 1 has 
multiplicity 2, and œ = —2 is a simple root. 


(b) f(z) = x? — 3x? + 3x — 1 has a= 1 as a root of multiplicity 3. 
To see this, note that 


The result follows from (7.38). 


(c) f(x) = 1 — cos(x) has a = 0 as a root of multiplicity m = 2. To 
see this, write 


zd = x h(x) 


T 


with h(0) = 4. The function h(x) is continuous for all x. 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


When the Newton and secant methods are applied to the calculation of a 
multiple root a, the convergence of a — £n to zero is much slower than it 
would be for simple root. In addition, there is a large interval of uncertainty 
as to where the root actually lies, because of the noise in evaluating f(x). 


The large interval of uncertainty for a multiple root is the most serious 
problem associated with numerically finding such a root. 


an 
10 
"ni 


E 

" 

[] n EE 
g 


s 
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> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


The noise in evaluating f(a) = (x — 1)?, which has a = 1 as a root of 
multiplicity 3. The graph also illustrates the large interval of uncertainty in 


finding a. 


To illustrate the effect of a multiple root on a rootfinding method, we use 
Newton's method to calculate the root œ = 1.1 of 


f(z) = (x-11)*(x- 2.1) 
2 2:951 ar z(-8.954 + £(10.56 + z(—5.4 + z))). (7.39) 


The computer used is decimal with six digits in the significand, and it uses 
rounding. The function f(x) is evaluated in the nested form of (7.39), and 
f'(x) is evaluated similarly. The results are given in the Table. 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


The column “ratio” gives the values of 
Q — Ln, 


(7.40) 


Q — In-1 


and we can see that these values equal about 2, 


Lin; flan) Qa—Z, | Ratio 
0.800000 | 0.03510 | 0.300000 
0.892857 | 0.01073 | 0.207143 | 0.690 
0.958176 | 0.00325 | 0.141824 | 0.685 
1.00344 | 0.00099 | 0.09656 | 0.681 
1.03486 | 0.00029 | 0.06514 | 0.675 
1.05581 | 0.00009 | 0.04419 | 0.678 
1.07028 | 0.00003 | 0.02972 | 0.673 
1.08092 0.0 0.01908 | 0.642 


ND) OW] BY] oan ejl olz 


Table: Newton's Method for (7.39) 
The iteration is linearly convergent with a rate of Z, 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


It is possible to show that when we use Newton's method to calculate 
a root of multiplicity m, the ratios (7.40) will approach 


fed wet (7.41) 
m 


Thus, as z,, approaches a, 


Q — In X Ala — Ln-1) (7.42) 


and the error decreases at about the constant rate. In our example, 
A= 2, since the root has multiplicity m = 3, which corresponds to the 
values in the last column of the table. The error formula (7.42) implies 
a much slower rate of convergence than is usual for Newton's method. 
With any root of multiplicity m > 2, the number A > 1; thus, the 
bisection method is always at least as fast as Newton's method for 
multiple roots. Of course, m must be an odd integer to have f(x) 
change sign at x = a, thus permitting the bisection method to be 
applied. 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


f (zx) 


PRET OR Fie) 


f(z) =(e@—a)Ph(x), p20 


Apply the fixed point iteration theorem 


f'(z) = p(z — o)! h(z) + (x — a)Ph' (2) 
(a — a)?h(x) 


g(z) = z — 


p(w — o)? T h(a) + (a — a)Ph' (x) 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


If p = 1 = g(a) = 0 Then by theorem 2.8 = quadratic convergence 


Tk+1— €  k—oo l 7 
(xj — a)? 2 (o) 


If p > 1 then by fixed point theory, theorem 2.6 = linear convergence 


p—1 


Iri — a| € |z — a]. 


Eg. p=2,2*=35. 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


f(a) = (a—a)Ph(a), ^ h(o)7 0. 


Assume p is known. 


— 
Choi = g(xk) 
_ f(a) 
g(z) 2z—p F(a) 
/ ELS 
g(a) = ; 0 


> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots 


Can run several Newton iterations to estimate p: 
a-2x —1 
look at | HP 
Q — Tk p 
One way to deal with uncertainties in multiple roots: 


=> a is a simple root for y(x). 


> 3. Rootfinding > Roots of polynomials 


p(x) =9 
p(x) = ao + a£ +... + ant”, 


Fundamental Theorem of Algebra: 


p(z) = an(z — z1)(z — 22) ... (£ — Zn), 


an F 0. 


21,...,24 € C. 


> 3. Rootfinding > Roots of polynomials 


1. Descarte’s rule of sign 

Real coefficients 
9 v = # changes in sign of coefficients (ignore zero coefficients) 
ə k = # positive roots 


ik sw 


and k—v_ is even. 


Example: p(x) = 2° + 2z* — 3a3 — 5a? — 1. 
v=1sSk<1sSk=0ork=1. 
v-k={ y k=0 not even 


i. 3 


> 3. Rootfinding > Roots of polynomials 


For negative roots consider q(x) = p(—2). 
Apply rule to q(x). 


Ex.: q(x) = —a° + 2z* + 32? — 52? — 1. 
v —2 
k= or 2. 


> 3. Rootfinding > Roots of polynomials 


ai 
IG] S 1 0 IERI an 


Book: Householder " The numerical treatment of single nonlinear 
equations”, 1970. 


Cauchy: given p(x), consider 


pi(z) = |as|z" + Be- +... + [ai|z — |ao| = 0 


po(x) = |as|a" — loe ette — ...— |a1|z — |ao| = 0 


By Descarte's: p; has a single positive root p; 


pı € |G] € e. 


> 3. Rootfinding > Roots of polynomials 


p(x) = ao + aaz + aga? +... + aaa" (7.43) 
p(x) = ao + z (a1 + z (ao + ...-- x (a4 1 + GnZ))...) (7.44) 


(7.44) requires n multiplications and n additions. 


k-1 
E Lex : lx 
(7.43) to form apq": ds de 


n+ and 2n— 1x. 


> 3. Rootfinding > Roots of polynomials 


For any C € R define bg, k = 0,...,n. 


bn = an 
bk = ak + Gb, k=n-l1l,n-2,...,0 


p(C) 2 ao - 6 | a1 +... +C (Qn_1+ an)... 
————— 


bn —1 
| ——— 
bi 
S ——— !É— 2 
bo 


> 3. Rootfinding > Roots of polynomials 


Consider 
q(x) = by + box +... + bnz”! 


P(x) = bo + (x — C)q(z). | 


Proof. 


bo + (x — C)a(x) 
= bo + (x — C)(b1 + bor +... + baa T) 


ea ee . + (bn bn) b bp z” 
—— a CS a ——" 
ao ay Qn—1 an 
= ag + a£ +... + anz” = p(x). m 


Note: if p(¢) = 0, then bo = 0: p(x) = (x — C)q(x). 


> 3. Rootfinding > Roots of polynomials 


If ¢ is found, continue with q(x) to find the rest of the roots. 


> 3. Rootfinding > Roots of polynomials 


p(zx) 
Tii = e O kO ee 
à p' (zx) 


To evaluate p and p' at x = C: 


P(C) = bo 
p (x) = q(x) + (x — ¢)q'(z) 
p'(C) = q(c) 


> 3. Rootfinding > Roots of polynomials 


Given: a = (ao, @1,-..-,@n) 
Output: b = b(bj, be,..., bn): coefficients of deflated polynomial g(x); 
root. 


Newton(a, n, £o, €, itmax, root, b, ierr) 


itnum = 1 


il, € mudo = Ope umm Go 
fr k=n-—1,...,1 Oa = Ci A oe c:=be +c p'(C) 
bo :— ao + Cbi p(¢) 
O if c — 0, iter = 2, exit pc) —0 
z1 = t0 — bo/c X1 = XO — p(xo)/p (zo) 


e if [xg — z1| € £, then ierr= 0: root = xv, exit 
9 it itnum - itmax, then ierr - 1, exit 
itnum = itnum + 1, r9 :— 21, 

quad go to 1. 


> 3. Rootfinding > Roots of polynomials 


p(x) = ao a1z +... + anz” 

roots: ¢1,...,Cn 
Perturbation polynomial q(x) = bo + biz +... + bne” 
Perturbed polynomial p(x, €) = p(x) + eq(z) 

= (ag + cbo) + (a4 + &bi)z +... + (an + ebp) x” 

roots: ¢;(€) - continuous functions of e, G;(0) = Gi. 

(Absolute) Conditioning number 
I5 (5) — Gal 


uL C 


> 3. Rootfinding > Roots of polynomials 


1/3 


Setty=x-—1 and a=e 
p(z,e) = y? — € = y? — à? = (y — a)(y? + ya + a°) 


Yaa 
= —a + y —3a2 E —a(1 + i3) 
72,3 9 9 
1—23V3 
——— CE 


> 3. Rootfinding > Roots of polynomials 


> 3. Rootfinding > Roots of polynomials 


p(x), Simple root ¢ 


p(c) — 0 
p (c) #0 
p(x, €) : root C(e) 


Ce) =C+ 3 ue" 
(=1 


= ¢ Hye + ye? +... 
“~~ ——— 


this is what matters negligeable if Eis small 


GE) = € 
€ 


miM XL mi yı 


—0oo 


> 3. Rootfinding > Roots of polynomials 


To find 71: 


(0) =" 
p(c(c),e) =0 
p(C(e)) + ea(C(e)) = 0 
v (C(&))C (e) + a(G(e)) ted’ (C(e))C'(e) = 
& —0 

(Qc) qt) 0 9 = - D 
p (c) € (0) +4(6) =0 => q= WO 

yı 
ENIM 
ke = [u| = val 


k is large if p'(C) is close to zero. 


> 3. Rootfinding > Systems of Nonlinear Equations 


If we denote 


fo (x) 


then (7.45) is equivalent to writing 


P(x) = 0; 


(7.45) 


(7.46) 


> 3. Rootfinding > Systems of Nonlinear Equations 


x = G(x), G: R” > R” 


Solution œ: œ = G(q) is called a fixed point of G. 
Example: F(x) = 0 
x=x-— AF(x) for some A € R™*™, nonsingular matrix. 


= G(x) 


Iteration: 


initial guess zo 
Xn+1 = G(x„), n=0,1,2,... 


> 3. Rootfinding > Systems of Nonlinear Equations 


Recall x € R™ 


m 1/p 
IIXIlp = n »  l£p«oo 


i=1 


|X|os = max [i]. 
<n 


1«i 


> 3. Rootfinding > Systems of Nonlinear Equations 


Recall x € R™ 


m 1/p 
I|XIlp = (X: «n , 1<p< œ, 
i—l 


I[XIloo = max |z: 
1«icn 


Matrix norms: operator induced 
A E€ Rmxm 


A 
|All = ees ee eae 
zER™ x40 IIXIlp 
(ils = eer: Rent 


1<i<m 


> 3. Rootfinding > Systems of Nonlinear Equations 


Let || - || be any norm in R”. 


Definition 


G : R” — R” is called a contractive mapping if 
|G(x) - G(y)l €&Alx— yl, — Vx;y € R”, 


for some A « 1. 


> 3. Rootfinding > Systems of Nonlinear Equations 


Theorem (Contractive mapping theorem) 


Assume 
Q D is a closed, bounded subset of R™. 
@ G: D— D is a contractive mapping. 

Then 
o J unique a € D such that a = G(a@) (unique fixed point). 


ə For any xy € D,x,41— G(x,,) converges linearly to a with rate A. 


> 3. Rootfinding > Systems of Nonlinear Equations 


We will show that ||x,,|| — o. 


lxi — xil = ||G(x;) — GG; ll S Allxi — xi-ill 
€... € Alx — xol| (by induction) 


k—1 k—1 
lx- xoll = || X xe — xs) € 3 lec — xil 


i=0 i=0 
k—1 Lk 
< 9 ~ lx — xol = t, lix: — xol 
i=0 
< — lx - xoll 


> 3. Rootfinding > Systems of Nonlinear Equations 


Vk, £: 


lx+ — xil = 1G, 49_1) 7 G(x, _ ll 
< Alx- — Xk-1|| 
€... € Mx; — xoll 
AE 
0 
«pape cl os 


=> {xn} is a Cauchy sequence > {xn} > a. 


> 3. Rootfinding > Systems of Nonlinear Equations 


Xn+1 = G(x,,) 
|n—oo 


a = G(a) 


— a is a fixed point. 
Uniqueness: Assume 3 = G(3) 


le- Sll = ||G(a) - G(B)|| € Alla — Bl, 
(1— A) la - All| < 0 = le - £l 20 e = B. 
—— 


20 


Linear convergence with rate A: 


||Xn-à = ed = lIGGo) - G(a@)|| < Allxas — all. 


> 3. Rootfinding > Systems of Nonlinear Equations 


Definition 
F : R™ — R” is continuously differentiable (F € C1(R™)) if, for every 


XER 
Ofi(x) = 
D ? oJ 


Ofi(x) Ofi(x) 
m Ox, a Ozm 
PO]: 
Ofm() O fm (x) 
Ox, Pu OZ qm, ase 


(F'(x));; = feo ie = Mess stb 


tg 


> 3. Rootfinding > Systems of Nonlinear Equations 


Theorem (Mean Value Theorem) 
Jf:R”—R, 


f(x) - fly) = VEz) (x — y) 


for some z € X, y, where V f(z) = 


Proof: Follows immediately from Taylor's theorem (linear Taylor 
expansion). Since 


Vf(2)'(x - y) = 2/0 


Qu = Yi) ae 


Oxy 


> 3. Rootfinding > Systems of Nonlinear Equations 


fil) 
F(x) = : : 

fm (x) 
filx) — fiy) = V fi(zi)! (x — y), (recie a. m 


> 3. Rootfinding > Systems of Nonlinear Equations 


Consider x = G(x), (Xn+1 = G(x,,)) with solution a = G(a). 


ag — (Xn+1)i = gi(e) — Gil xp) 
MYT vg) la- xn),  i-1...,m 


Vgi(zi)* 


Q — X441 = (o — Xp) Zj € OQ, Xa 


V9m(Zm)* 
Jn 
Q — X441 = J4(a — Xn) (7.47) 


Vgi(o)* 

If x, > a, Jn > f = G'(a). 
Vg (o) 

The size of G'(a) will affect convergence. 


> 3. Rootfinding > Systems of Nonlinear Equations 


Theorem 2.9 
Assume 
ə D is closed, bounded, convex subset of IR". 
ə GeC!(D) 
e G(D)CD 
ə A= maxxep ||G'(x)||oo < 1. 
Then 
(i) x = G(x) has a unique solution a € D 
(ii) vxo € D, Xn+1 = G(x„) converges to ar 
(iii) Ja — Xn4alloo < (||G"(@)|loo + En) | — Xnlloo, 


whenever €n —> 0. 
TL— OO 


> 3. Rootfinding > Systems of Nonlinear Equations 


Proof: Vx,y E D 


lgi(x) — gi(y)| € Valz (x -y), mexy 


Ogi(zi) | Ogi (zi) 
= (xj—wj) € Iz; — yi 
2. Os 4 d 2 Ox; l ? 
^ | Ogi(zi 
«Y: [2869 i — vi, < Qe lix — yl 
= is 
j=l 
> [G(x) - G()llss € IIG'(zi)llsellx — yllo S Allx — yll 
— G is a contractive mapping. = (i) and (ii). 


To show (iii), from (7.47): 


|: = xallee € IJnllosl|e — xnlloc 


< (Jn - G'teDll- IG (o) loo) le — Xnlloo- m 
———— 


En — 0 
TL,—oo 


3. Rootfinding > Systems of Nonlinear Equations 


322 + 42% -1=0 


, for a near (x1, 23) = (—.5,.25). 


—.17 ria +425, — 1 
dy 8r], —1 


—.26 


> 3. Rootfinding > Systems of Nonlinear Equations 


X541 = Xn — AF (xn) 
——— 


G(x) 
IG" (e) |loo ~ 0.04, le—-x*»ile 404 AE (F'(xo)) 
| — xn llo 
Why? 
G'(x) = I — AF'(x), G'(a) 2 I — AF'(a) 
Need 


IG (o) los ~ 0 
Acm(F(o) ,  Ac(F(x)) 


—1 


X444 = Xn — (F'(xo)) | F(xn) 


> 3. Rootfinding > Systems of Nonlinear Equations 


X4 = Xn — (F'(xn)) |F(x4, | 2=0,1,2,... 
Given initial guess: 
fi(x) = fi(xo) + V fixo)" (x — xo) + O (Ix — x0) II”) 
—— 


neglect 


F(x) ~ F(x,) + F'(xo)(x — xo) = Mo(x) 


Mo(x): linear model of F(x) around xo. 
Set xı: Mo(x1) = 0 


F(x,) -+ F'(xo) (x1 — X9) —0 
X1 = Xo — (F'(xo)) ^ F(x,) 


> 3. Rootfinding > Systems of Nonlinear Equations 


In general, Newton's method: 
- 
Xn+1 = Xn — (F'(x4)) F(x,,) 
Geometric interpretation: 


mi(x) = fi(xo) + Vfi(xo) (x xo), t= 1,...,m 
fi(x) : surface 


mi(x) : tangent at xo. 


In practice: 


Q Solve a linear system F'(xn)ôn = —F(x,,) 
Q Set x41 = Xn + Ôn 


> 3. Rootfinding > Systems of Nonlinear Equations 


F(x) =o) 


x =x- (F(x) F(x) = G(x), xn = G(xn) 


Assume F(a) = 0, F'(o) is nonsingular. 
Then G'(a) = 0 (exercise !) 
|G") loo = 0 


If G’ € C! (B.(a)) where B,(o) = {y : |y — al] < r}, by continuity: 
[Geile «1 fore B.) 


for some T. 
By Theorem 2.9 D = B; => linear convergence. 


> 3. Rootfinding > Systems of Nonlinear Equations 


Assume 


o F’ € C!(D) 

ə Ja € D such that F(a) = 0 

o F'(o) € Lip,(D) 

e 3 (F(o)) and |F'(o)|* x 8 
Then de > 0 such that if ||xo — al| < e, 
S xia xs - (Fx) F(x) > o and [xil < Bylina. 


(Gy: measure of nonlinearity) 


1 
So, need £ < a» 


Reference: Dennis & Schnabel, SIAM. 


> 3. Rootfinding > Systems of Nonlinear Equations 


Xn+1 = Xn A, F(x, ), An © F'(x,) 
Ex.: Finite Difference Newton 


_ fil&n + hrej) — fi(xn) || OFi(Xn) 


Jj , 
An On; 


hn ~% V6 and ej = (0...0 ! 0...0)7. 


th 


j“ position 


> 3. Rootfinding > Systems of Nonlinear Equations 


Newton's method: 

Xn41 = Xn + Sndn 

when 

dn zz -(F(x,)) !F(x,) 

E 

If Newton step s, not satisfactory, e.g. ||F(x,,,,)|l2 > ||F(x,,)|l2 
Sn *€— Sn for some g < 1 (backtracking) 

We can choose s,, such that 


p(s) = FG, + sdn)||2 


am, iS minimized. Line Search 


> 3. Rootfinding > Systems of Nonlinear Equations 


In practice: minimize a quadratic model of y(s). 


Trust region 


Set a region in which the model of the function is reliable. If Newton 
step takes us outside this region, cut it to be inside the region 


(See Optimization Toolbox of Matlab) 


> 3. Rootfinding > Matlab’s function fsolve 


The MATLAB instruction 
zero = fsolve('fun', x0) 
allows the computation of one zero of a nonlinear system 


fi(21, £2, ies Un) =0 
fo(@1, £2, $e sux) = 0, 


fn(£1, £2, ttt Zn) = 0, 


defined through the user function fun starting from the vector x0 as initial 
guess. 

The function fun returns the n values fi(x),..., f, (x) for any value of the 
input vector zx. 


> 3. Rootfinding > Matlab’s function fsolve 


For instance, let us consider the following system: 


r? ES y? = 1, 
sin(12/2) + y? =0, 


whose solutions are (0.4761, -0.8794) and (-0.4761,0.8794). 
The corresponding Matlab user function, called systemnl, is defined as: 


function fx-systemnl(x) 
fx = [x(1) 2*x(2) 2-1; 
sin(pi*0.5*x(1) )+x(2)°3;] 

The Matlab instructions to solve this system are therefore: 
SS x0 = [I]; 
>> options—optimset(' Display','iter'); 
>> [alpha,fval] = fsolve('systemnl',xO,options) 
alpha = 

0.4761 -0.8794 


Using this procedure we have found only one of the two roots. The other can 
be computed starting from the initial datum -x0. 


> 3. Rootfinding > — Unconstrained Minimization 


f(z1,29,...,2,) : R” > R; 


min f(x) 


Theorem (first order necessary condition for a minimizer) 


If f € Ct(D), D C R”, and x € D is a local minimizer then 
NUS 


Solve: 


Vf(x) 20wihVf£-| : |. (7.48) 


> 3. Rootfinding > — Unconstrained Minimization 


Apply Newton's method for (7.48) with F(x) = V f(x). 
Need F'(x) = V?f(x) = H(x). 

NRI 

WT ÓrjÓr; 


Xii = Xn — H(xn) !V f (xn) 


If H(a) is nonsingular and is Lip}, then xn — « quadratically. 
Problems: 


@ Not globally convergent. 
@ Requires solving a linear system each iteration. 
@ Requires V f and H. 


Q May not converge to a minimum. 


Could converge to a maximum or saddle point. 


> 3. Rootfinding > — Unconstrained Minimization 


@ Globalization strategy (Line Search, Trust Region). 
@ Secant Approximation to H. 
@ Finite Difference derivatives for V f not for H. 


Theorem (necessary and sufficient conditions for a minimizer): 


Q Assume f € C?(D), D c R?, dx € D such that V f(x) =0. Then x 
is a local minimum if and only if H(x) is symmetric positive 
semidefinite (vTHv >0 Vv € R”) 


f(x) < f(y) 


x is a local minimum for Vy € B(x). 


> 3. Rootfinding > — Unconstrained Minimization 


Newton's method: 


X444 such that Verg 1444) = 0 


> 


3. Rootfinding > Unconstrained Minimization 


We need to guarantee that Hessian of the quadratic model is 
symmetric positive definite 


V?m, (xj) = H(x,,). 


Modify 
Xn+1 = Xn — H^ (xn) V (Xn) 


where 


H(x,) = H(xn) + us, 


for some un > 0. is z - 
If Ay,..-,An eigenvalues of H(x,,) and A1,..., Àn eigenvalues of H. 


Xi = Ài + ln 


Need: Ln: Amm + Hn > 0. 
Ghershgorin Theorem: A = (aij), eigenvalues lie in circles with centers 
aj; and radius r = 577 jz; |aij]. 


ALA 


> 3. Rootfinding > — Unconstrained Minimization 


> 3. Rootfinding > Unconstrained Minimization 


Xn+1 = Xn Hox civi) 


Definition 


d is a descent direction for f(x) at point xo if 
f(xo) > f(xo + ad) for 0 € a < ao 


d is descent direction if and only if V f (xo)? d < 0. 


Newton: X544 = Xn + dn, dn = —H(x,) 1V f (xn). 
dn is a descent direction if H(x,,) is symmetric positive definite. 


Vi(Xn) dn = -V f (Xn) H(x,) !V f(x.) «0 


since H(x,,) is symmetric positive definite. 


A 
a 
\ 
TEM 
9 


> 3. Rootfinding > Unconstrained Minimization 


Xn+1 = Xn + Sndn, d, = -V f (Xn), Sn = min f (Xn + sdn) 


Level curve: C = {x| f(x) = f(xo)}. 
If C is closed and contains œ in the interior, then the method of 
steepest descent converges to œ. Convergence is linear. 


> 3. Rootfinding > — Unconstrained Minimization 


Weaknesses of gradient descent 


@ The algorithm can take many iterations to converge towards a local 
minimum, if the curvature in different directions is very different. 


Q Finding the optimal s,, per step can be time-consuming. Conversely, 
using a fixed sn can yield poor results. Methods based on Newton's 
method and inversion of the Hessian using conjugate gradient 
techniques are often a better alternative. 


